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Abstract. The Press & Schechter "numerical recipe" is briefly reviewed, 
together with the recently proposed dynamical mass function theory, in 
which the mass function is constructed by using the powerful Lagrangian 
perturbation theory. The dynamical mass function is found in good agree- 
ment with the recent N-body simulations of Governato et al. (1998), in 
the case of an Einstein-de Sitter Universe. The definition of collapse, 
the relation between mass and smoothing radius, and the definition of 
structure in ID Universes are discussed. A detailed comparison of the 
dynamical mass function to simulations reveals that the orbit-crossed re- 
gions in the simulation are correctly reproduced, while the fragmentation 
of the collapsed medium into structures cannot be done in a univocal 
way. Finally, we try to answer the question: why the hell does the Press 
& Schechter work? 



1. Introduction 

The mass distribution of collapsed dark matter clumps (called mass function) 
is one of the most important pieces of information one wants to obtain from 
a cosmological model. These clumps are the sites where the most important 
astrophysical events take place; they are associated to proto-galactic halos, and 
to galaxy groups and clusters. 

Unfortunately, non-linear gravitational collapse is an unsolved problem in 
the general case, and the only way to get a fair estimate of the mass function is 
through N-body simulations or through (semi-) analytical approximations. 

The first approximation to the cosmological mass function was proposed. . . by 
Doroshkevich (1967), and later by Press & Schechter (1974; hereafter PS). The 
PS "numerical recipe" (as recently referred to by Efstathiou) can be summarized 
as follows: 



1. Take the initial density field, at a very early time, and smooth it on the 
scale R, such that the variance of the field is A = a 2 (R); we can't work 
with power at very small scales, which becomes non-linear very early. 

2. Follow the linear evolution of the density field: the density in each point 
is rescaled by the same time-dependent function b(t), the linear growing 
mode. 
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3. If the linearly evolved density gets as large as a threshold 5 C ~ 1, then 
the density in those points has gone non-linear! In other terms, it has 
collapsed. 

4. Compute the fraction of collapsed mass Q(< A) as the probability that 
the linearly evolved density is larger than the threshold S c . Underdensities 
(1/2 of mass with Gaussian statistics) in linear theory get more and more 
underdense, so no wonder that only 50% of mass is able to collapse; to 
achieve the correct normalization, multiply the result by two, and don't 
ask why! 

5. If the field is smoothed at a scale R, it is quite reasonable to assume 
that a typical mass M ~ QoR 5 will form (qq is the comoving average mass 
density). Then the mass function n(M) is computed through the following 
"golden rule" (as referred to by Cavaliere): 

dM (1) 

Despite its weaknesses, the PS numerical recipe works much better than 
expected. Governato et al. (1998) give the most recent comparison of the PS 
formula to large numerical simulations: the PS is a good approximation of the N- 
body curves (found with different clump-finding algorithms), for masses similar 
to or larger than the typical mass M m (z) which forms at a given redshift. The 
5 C parameter is not very different from the value 1.69 guessed by the spherical 
collapse model, except that it tends to get as small as 1.5 at z = 1 (for a standard 
CDM model). 

The PS did not receive much attention until 1988, when the first "large" N- 
body simulations of Efstathiou et al. (1988) and Carlberg & Couchman (1989) 
showed a good agreement with it. The mystery of the "fudge factor" of 2 was 
solved by Peacock & Heavens (1990) and Bond et al. (1991), who approached the 
"cloud-in-cloud" problem in a rigorous way (small structures can be included in 
larger collapsing ones, even though their density is not larger than the threshold, 
and taking this into account increases the total fraction of collapsed mass to 1). 
It was again a surprise to see that the PS formula, including the factor of 2, is 
exactly recovered if the linear density field is smoothed with a sharp filter in the 
/c-space. 

The history of the mass function theory is reviewed in Monaco (1998), but 
it can be effectively summarized in a sentence: there is a simple, effective and 
wrong way to describe the cosmological mass function. Wrong of course does 
not refer to the results but to the whole procedure. 

2. The mass function is an intrinsically Lagrangian quantity 

Once the "statistical" problem of achieving the correct normalization is solved, 
the worst defect of the PS recipe is that it completely neglects the complexities 
of gravitational dynamics, which is treated just at the linear level. 

The Lagrangian picture of fluid dynamics turns out to be an ideal frame 
in which to develop a semi-analytical theory for the mass function, able to go 



Mn(M)dM * Qo - — 
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beyond linear theory and spherical collapse. Before going on, it is appropriate 
to criticize the use of spherical collapse in cosmology, for a number of reasons: 

1. With spherical symmetry the collapse is clearly identified with the forma- 
tion of a singularity. In the more general case, the definition of collapse is 
quite tricky, as we are going to show in the following. 

2. Virialization is assumed to have taken place immediately after collapse. 
This neglects the many complicated transients which dominate structures 
like galaxy clusters. 

3. Further accretion is spherical. In the real case accretion comes preferen- 
tially from the filamentary network. 

The construction of a "dynamical" mass function is described in detail in 
Monaco (1995; 1997a,b) and reviewed in Monaco (1998). The dynamics of 
gravitational collapse is described though the powerful Lagrangian perturbation 
theory, truncated to third order (which is necessary to describe collapse, as 
the second order truncation gives wrong results in underdensities) . It is useful 
to recall that the first term in the perturbation series is the Zel'dovich (1970) 
approximation. If the infinitesimal mass element is described as a homogeneous 
ellipsoid, then ellipsoidal collapse turns out to be a useful truncation of the 
Lagrangian series. 

In the general case in which no symmetry is imposed to the initial density 
field, the definition of collapse itself is tricky. There is a natural way to define 
collapse, and it is when orbits start to cross each other (orbit crossing, hereafter 
OC). The typical pancake collapse takes place at OC; this means infinite density, 
infinite shear, shock waves in a subdominant baryon component, and, last but 
not least, no reliable solution after it! Then, as long as structures are searched 
for as high-density clumps, this definition makes sense; moreover, it marks the 
onset of multi-stream, highly non-linear dynamics and of shock-heating of gas. 

However, it might be argued that OC is too generous in mixing "real 
clumps" , as galaxy clusters, with the filamentary network existent outside clus- 
ters. If the mass element is modeled as an ellipsoid, then OC corresponds to the 
collapse on its first axis. As proposed for instance by Lee & Shandarin (1997), 
one might extrapolate the Lagrangian series so as to wait for the collapse of 
the element on the third axis, and then get the mass function of fully virialized 
clumps. The difference between the two collapse definitions (OC and third-axis 
collapse) is well illustrated in the following example. Consider a spherical peak 
in a homogeneous Universe, the profile of which is not a simple top-hat but de- 
creases from the center. While the central mass element collapses in a spherical 
way, all the other mass elements are subject to axial symmetry, so that their 
typical collapse is of the spindle kind: two axes collapse, the third one does 
not collapse at all. In the meantime each shell collapses in a perfectly spherical 
way. As a consequence, while the OC criterion selects all the collapsed mass el- 
ements, just one mass element collapses on the third axis. Then, the total mass 
of the spherically collapsing clump, according to the collapse definition based on 
third-axis collapse, is embarrassingly vanishing. 

This example highlights two important facts: (i) the local geometry of col- 
lapse is different from the global one; (ii) third-axis collapse is good for picking 
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Figure 1. Difference between orbit crossing and multi streaming. 



the seeds of collapsed structures, but not for obtaining their total mass, while 
OC can get the correct normalization, but mixes virialized clumps with fila- 
mentary transients. The second conclusion is confirmed by the fact that Lee & 
Shandarin (1997) need a fudge factor of 12.5 to achieve a good normalization 
for their mass function. 

Another limit of the OC definition is the following. Fig. 1 shows a ID sketch 
of a collapsing structure; q is the initial position of a particle (the Lagrangian 
coordinate) and x is its final position (the Eulerian coordinate). The triple 
valued region in the x(q) relation constitutes the whole multi-stream region, but 
the OC condition is able to select just the zones with negative slope in the x(q) 
relation. In other words, the OC condition misses the mass which is actually 
infalling on the structure, picking it only after its first crossing of the structure. 
This problem somehow compensates for the excessive generosity of OC in finding 
the collapsed mass: it just singles out matter which has had some time to relax! 
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Figure 2. The dynamical mass function compared to the N-body 
simulations of Governato et al. (1998) (see text for details). 



Note also that this problem is not present in the adhesion model, in which the 
caustics have an infinitesimal width: it is connected to the finite size of caustics. 

Once collapse is defined, one can compute directly from initial conditions 
the collapse time of each mass element. It is convenient to express time in terms 
of the linear growing mode b(t), (to factorize out the dependence on cosmology), 
and to define the following quantity: 

where b c is the collapse "time" for the element q. In the linear theory case, 
F = S/5 C . Note that, being F (the inverse of) a time, any threshold F c is just 
given by the inverse time at which the mass function is wanted. There are no 
free parameters in this theory. 

The F(q) function depends also on the smoothing radius used for the 
smoothing. To calculate the mass function with the correct normalization, it 
is possible to extend the "excursion set approach" of Bond et al. (1991) to the 
case of the non-Gaussian F process, smoothed with a sharp fc-space filter. The 
case of Gaussian smoothing is treated as in Peacock & Heavens (1990). Note 
that a PS-like approach (Monaco 1995), with no fudge factor, is enough to ob- 
tain a fair estimate of the mass function, as the fraction of points which will 
never collapse is not 50%, as in linear theory, but it is smaller than 8%. 

The shape of the filter should be chosen so as to optimize the dynamical 
predictions; the Gaussian filter is usually suggested. It is very interesting to see 
that the dynamical mass function gives meaningful results only in the case of 
Gaussian smoothing. In particular, it is very similar to a PS curve with a S c 
parameter about 1.5, smaller than the spherical 1.69 value. This is explained by 
the fact that the spherical collapse neglects tidal forces, which are able to speed 
up the collapse, thus forming larger objects. 

Fig. 2 shows a comparison of the dynamical mass function (with Gaussian 
smoothing) with the simulations of Governato et al. (1998); the left panel shows 
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the results for a flat Einstein-de Sitter Universe (standard CDM, h = 0.5), at 
redshift z = 1.86, 1.13, 0.43 and 0, while the right panel shows the results for an 
open Universe (O = 0.3, h = 0.75, no cosmological constant), at redshift z = 1, 
0.58, and 0. Groups are found with the standard friends-of-friends algorithm, 
with linking- length 0.2 (corrected in the open case as described in the paper); 
the authors have verified that the use of other clump-finding algorithms does not 
change much the results. The N-body mass functions are compared with the 
standard PS prediction (with 6 C = 1.69) and with the dynamical mass function, 
using ellipsoidal collapse. It must be noted that the 3rd-order prediction is 
known to underestimate the number of large-mass objects (Monaco 1997a): in 
this range the ellipsoidal prediction performs better. 

Despite the absence of free parameters to tune, the dynamical mass func- 
tion fits the N-body results remarkably well in the Einstein-de Sitter case: the 
agreement is improved with respect to the PS formula, with the possible ex- 
ception of the z = mass function which is slightly overestimated, while the 
z = 1.86 mass function is underestimated in the highest-mass tail. The trend 
of the best-fit delta c with redshift is not reproduced at this stage. On the other 
hand, the PS formula describes well the results of the open simulation, better 
than the dynamical mass function. Note also that the sharp /c-space predictions 
lie more than a factor two above the Gaussian curves: at variance with the PS 
theory, the sharp fe-space filter does not seem a good choice in this context. 

Another conclusion of the dynamical mass function theory is that the small- 
mass tail cannot be predicted in a robust way. Many factors hamper a reliable 
prediction of small-mass clumps; most importantly, the Lagrangian perturbation 
theory does not converge to a solution, and any error in the definition of large 
masses strongly influences the number of small-mass objects. 

3. From smoothing radius to mass: a ID perspective 

Once the mass function theory is founded on a solid dynamical basis, it is worth 
investigating in some detail the geometrical problem of going beyond the "golden 
rule" (Eq.|j). In other words, we want to obtain the distribution of the masses 
M of the objects which form at a given smoothing radius R, so as to relate the 
fraction of collapsed mass to the mass function in a rigorous way. 

It is very useful to consider this problem first in ID. Monaco & Murante 
(1998) have solved the mass function problem in ID in all its details; this is a 
very instructing game, which gives many precious hints on how to face the more 
complex 3D problem. It is useful to construct a function R c (q), which gives, 
for each point q in the Lagrangian space the largest smoothing radius R c at 
which the point is predicted to collapse within a fixed time t. Fig. 3 shows some 
examples of these curve, in the case of a white noise power spectrum. Both the 
predictions based on Gaussian and sharp /c-space filters are shown. "Objects" 
can be easily associated to connected segments of collapsed points; the objects 
present at a given radius R are then given by the intersection of the R c {q) curve 
with a line of constant R. Note also that, being R c the largest smoothing radius 
at which collapse takes place, the collapsed regions are already corrected for the 
cloud-in-cloud problem. 
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Figure 3. Rc(q) curves in a ID Universe. 




Figure 4. Cumulative and differential growing curves. 



It is important to note that, when R is decreased, the collapsed objects show 
a typical behaviour: their mass (=length) starts growing, and then saturates to 
a certain value, remaining approximately stable for a significant range in R. 
This stabilization is remarkable, and necessary in order to define the mass of an 
object in any meaningful way. 

Under the hypothesis that the objects follow a well defined average growing 
curve G(R), it is possible to connect the fraction of collapsed mass with the 
abundance of objects of mass M in the following way: 

i r°° 

n(>R) = — Mn(M)G(R, M)dM (3) 
Qo Jo 

With some algebra it is possible to write Eq. (||) in the form of a convolution 
of the mass function with the differential growing curve. Fig. 4 shows the 
cumulative and differential growing curves in the Gaussian and sharp fc-space 
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cases (both M and R are expressed in terms of the variance A) ; while the shar k- 
space curves are not particularly nice, the Gaussian differential curve shows one 
pronounced and narrow peak. It is easy to see that the golden rule corresponds 
to a Heavyside function for the growing curve, or a Dirac 5 function for the 
differential one; in other words, the golden rule corresponds to approximating 
the peaks shown in Fig. 3 with many rectangles, the height (R) and width (M) 
of which are related. In the realistic case of a non-<5 differential growing curve, 
the mass function must be deconvolved from it. However, for most practical 
purposes the mean value of the differential growing curve is enough: the golden 
rule strikes back! even though this average value could be different from that 
inferred from the mass of the filter, and could depend on the power spectrum. 

Such a dependence of the average mass on the power spectrum is proba- 
bly at the origin of the observed decrease of S c with redshift in CDM Universes 
(Governato et al. 1998). The slope of the power spectrum, at the scale cor- 
responding to the onset of non-linearity, gets smaller (and negative) at higher 
redshift, which means that the smoothed density field gets more correlated. It 
is then quite natural to expect larger masses, i.e. a decrease of 5 C , at higher 
redshift. 

4. OC regions in N-body simulations 

The ID problem is also a good starting point for performing a detailed compar- 
ison of the mass function theory with simulations. Note also that in ID linear 
theory (with S c = 1) and Zel'dovich approximation are equivalent, and more- 
over they are an exact solution up to OC. What is predicted by the theory is 
not the mass function of "virialized" structures (the word "virialized" is widely 
misused in cosmology: we don't understand the physical processes which lead to 
virialization, and then we don't know how to quantify them), nor is it the mass 
function of halos found with the friends-of-friends or whatever algorithm. The 
predictions refer to regions which are in OC. It is then useful to find OC regions 
directly in the simulations, and to compare them to the theoretical predictions; 
in a second time, the relation between such OC regions and groups, as found by 
some clump-finding algorithm, can be investigated. 

OC regions in the simulations are found by smoothing the x(q) map (the 
final positions of particles as a function of their initial positions) in the La- 
grangian space, then considering all the points with negative slope as collapsed. 
The OC is then scale-dependent; one can construct the R c (q) curve also in this 
case, and compare it with the predicted one. From OC, it is possible to recon- 
struct the whole multi-stream (MS) regions, as the multi- valued regions in the 
x(q) relation. The R c curves for the OC and MS regions are shown in Fig. 3, 
together with the predicted ones. It can be concluded that 

1. The OC R c curve is well reproduced by the Gaussian curve. 

2. The sharp fc-space curve is remarkably different both from the Gaussian 
and from the OC curve; then, Gaussian smoothing is definitely preferred. 

3. Remarkably, the sharp /c-space curve is similar to the MS curve. 
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Figure 5. Countour levels of the R c curve in 3D (evaluated in a plane 
cut of the simulation box). Left: ellipsoidal collapse. Right: OC from 
the simulation. 

It is possible to extend this analysis to 3D. This has been done with a test 
P 3 M simulation with 64 3 particles on a 64 3 grid. In this case the OC regions 
are found by smoothing the vector field x(q), obtained from the output of the 
N-body simulation, in the Lagrangian space, then calculating the Jacobian de- 
terminant J(q) = det(dx a / dqb) in each point; the points with negative Jacobian 
determinant are in OC. The curves -R c (q) are then constructed. Fig. 5 shows 
the contour level of two R c curves, evaluated on a plane cut of the simulation 
box. The R c curves refer to the prediction of ellipsoidal collapse and to the OC 
in the simulation. Ellipsoidal collapse is able to reproduce the OC regions in 
simulations fairly well. On the other hand, linear theory (not shown)turns out 
to be not bad in predicting the largest structures, but misses completely the con- 
nected network, which gives origin to small-scale objects (we are using Gaussian 
smoothing, in which case the linear theory prediction is known to underestimate 
the mass function). 

5. The mass function does not exist! 

As in the ID case, the 3D R c curves tend to be, at large smoothing radii, a 
collection of isolated and simply connected regions, the mass of which saturates 
as R decreases. It is then possible to construct an average growing curve for 
them, which can then be used to solve the geometrical problem described above. 
However, when the variance becomes comparable to one, the collapsed regions 
become multiply connected in a very messy way. As a consequence, at variance 
with the ID case, there is no obvious way to define the mass of structures; in 
other words, it is not straightforward to construct an algorithm to fragment the 
collapsed medium into "structures". We are currently trying to find reliable 
fragmentation algorithms; the goodness of a fragmentation algorithm can be de- 
cided by seeing how well it reproduces the structures found in some other way. 
For instance, one could construct a fragmentation algorithm able to reproduce 
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the friends-of-friends halos, but then another algorithm will be needed to repro- 
duce, e.g., the DENMAX halos, which are slightly different. Moreover, every 
clump-finding algorithm is parametric (e.g., in the friends-of-friends algorithm 
the linking length must be fixed), and the results depend on the actual value 
used for the parameters. 

There is no univocal way to find halos in simulations, and there is no uni- 
vocal way to fragment the collapsed medium into structures. As a consequence, 
there is no univocal way to define the mass function! The only well-defined 
quantities are the R c curve shown in Figs. 3 and 5. 

Luckily, this "nihilist" position is exaggerated from a practical point of 
view: the largest structures are recovered more or less in a similar way by 
different clump-finding algorithms, and the problem comes as usual from the 
small-mass clumps, which come from the multiply-connected medium (and from 
the left-overs of the largest structures). Anyway, this implicit uncertainty must 
be considered when judging, for instance, the agreement of a mass function 
theory with a friends-of-friends curve, as that shown in Fig. 2. 

6. Work in progress 

The dynamical mass function theory is currently been analyzed and applied 
to many interesting problems. The comparison to 3D simulations will be soon 
performed with the large (360 3 points) simulations already used by Governato 
et al. (1998). 

Once a reliable fragmentation algorithm is found, it can be used to generate 
large catalogues of objects. The semi-analytical predictions can be extended to 
the spin and merging history of objects: the spin is acquired from tidal torques 
during the mildly non-linear evolution (Catelan & Theuns 1996), while merging 
histories are found through the excursion set approach (Lacey k, Cole 1993), 
even though the role of the filter in this case must be clarified. This can have 
some definite advantages with respect to the standard N-body simulations, as 
the semi-analytical calculations are much, much faster. It can be used, for 
instance, to simulate halos in deep pencil beams, or to find galactic halos in 
Hubble volume-sized realizations. 

7. Why the hell does the Press &: Schechter work? 

None of the problems presented in this paper have been considered by PS; 
nonetheless, their formula appears to work quite well. Table 1 contains some of 
the elements which should be taken into account to construct a fully rigorous 
theory for the mass function; it is indicated how these element influence the large 
mass tail of the mass function. We have seen that non-linear (and non-spherical) 
dynamics gives an increase of large mass objects because of tidal forces. But 
the PS is based on sharp fe-space smoothing, while Gaussian smoothing is pre- 
ferred; this leads to a decrease of large-mass objects. On the other hand, OC 
neglects the matter infalling on the structure, which is already in multi-stream 
regime; taking this into account increases the mass of objects. But OC mixes 
"virialized" clumps with filaments; taking them out of the game would of course 
decrease the mass of objects. The deconvolution procedure described in Section 
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3 would generally decrease the number of large mass objects, because of the 
width of the differential growing curve. Add to this the rather unpredictable 
value of the average M(R) relation. 



Table 1. Neglected elements in the PS 



Element 



Effect 
on large masses 



Dynamics 
Gaussian filtering 

MS vs OC 
Exclude filaments 
Deconvolution 
Average M(R) 



Increase 
Decrease 
Increase 
Decrease 
Decrease 
??? 



In practice, the PS numerical recipe is the simplest way to construct a mass 
function from the Gaussian statistics of the initial density. It is as if some kind 
of "central limit theorem" were in act: the PS works because it neglects so 
many things that they all tend to compensate each other! especially if some free 
parameter can be tuned to obtain good fits to numerical simulations. 
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